Sparse Learning for Large-Scale and High-Dimensional Data: A Randomized Convex-Concave Optimization Approach
Authors
Abstract
In this paper, we develop a randomized algorithm and theory for learning a sparse model from large-scale and high-dimensional data, which is usually formulated as an empirical risk minimization problem with a sparsity-inducing regularizer. Under the assumption that there exists an (approximately) sparse solution with high classification accuracy, we argue that the dual solution is also sparse or approximately sparse. The fact that both primal and dual solutions are sparse motivates us to develop a randomized approach for a general convex-concave optimization problem. Specifically, the proposed approach combines the strength of random projection with that of sparse learning: it utilizes random projection to reduce the dimensionality, and introduces l1-norm regularization to alleviate the approximation error caused by random projection. Theoretical analysis shows that under favorable conditions, the randomized algorithm can accurately recover the optimal solutions to the convex-concave optimization problem (i.e., recover both the primal and dual solutions).
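As a rough illustration of the two ingredients described above, the Python sketch below projects high-dimensional features with a Gaussian random matrix and then fits an l1-regularized linear classifier in the reduced space. The problem sizes, penalty strength, and the use of logistic loss are illustrative assumptions; this is a minimal sketch of the idea, not the algorithm analyzed in the paper.

    # Minimal sketch (not the paper's algorithm): Gaussian random projection
    # followed by l1-regularized classification in the reduced space.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n, d, m = 500, 10_000, 200            # samples, ambient dimension, projected dimension
    w_true = np.zeros(d)
    w_true[:20] = rng.normal(size=20)     # (approximately) sparse ground-truth model
    X = rng.normal(size=(n, d))
    y = np.sign(X @ w_true + 0.1 * rng.normal(size=n))

    # Step 1: random projection R reduces the dimensionality from d to m.
    R = rng.normal(size=(d, m)) / np.sqrt(m)
    X_proj = X @ R

    # Step 2: l1-norm regularization helps control the approximation error
    # introduced by the projection (hyperparameters are illustrative).
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=1.0)
    clf.fit(X_proj, y)
    print("training accuracy in the projected space:", clf.score(X_proj, y))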
Related Papers
Finding sparse solutions to problems with convex constraints via concave programming
In this work, we consider a class of nonlinear optimization problems with convex constraints, with the aim of computing sparse solutions. This is an important task arising in various fields such as machine learning, signal processing, and data analysis. We adopt a concave optimization-based approach, define an effective version of the Frank-Wolfe algorithm, and prove the global convergence of ...
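For context, the following is a minimal, generic Frank-Wolfe iteration for least squares over an l1 ball, included only to show why Frank-Wolfe steps tend to produce sparse iterates; it is not the concave-programming formulation of that paper, and the radius and problem sizes are illustrative assumptions.

    # Generic Frank-Wolfe for min 0.5*||Ax - b||^2 over the l1 ball of radius tau.
    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.normal(size=(100, 300))
    x_true = np.zeros(300)
    x_true[:5] = 1.0
    b = A @ x_true
    tau = 10.0                           # l1-ball radius (illustrative)

    x = np.zeros(300)
    for t in range(200):
        grad = A.T @ (A @ x - b)         # gradient of the least-squares objective
        i = np.argmax(np.abs(grad))      # linear minimization oracle on the l1 ball
        s = np.zeros_like(x)
        s[i] = -tau * np.sign(grad[i])
        gamma = 2.0 / (t + 2.0)          # standard step-size schedule
        x = (1 - gamma) * x + gamma * s  # each step adds at most one new nonzero
    print("nonzeros in x:", np.count_nonzero(x))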
Mammalian Eye Gene Expression Using Support Vector Regression to Evaluate a Strategy for Detecting Human Eye Disease
Background and purpose: Machine learning is a class of modern, powerful tools that can solve many important problems people face today. Support vector regression (SVR), a notable member of the machine learning family, is a way to build a regression model. SVR has been proven to be an effective tool for real-valued function estimation. As a supervised-learning appr...
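A generic scikit-learn SVR sketch on synthetic data is shown below, standing in for gene-expression features that are not available here; it is not the authors' pipeline.

    # Generic epsilon-SVR on synthetic data (not the study's data or settings).
    import numpy as np
    from sklearn.svm import SVR
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(2)
    X = rng.normal(size=(300, 50))                 # stand-in for expression features
    y = X[:, 0] - 2 * X[:, 1] + 0.1 * rng.normal(size=300)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    model = SVR(kernel="rbf", C=1.0, epsilon=0.1)  # epsilon-insensitive loss
    model.fit(X_tr, y_tr)
    print("R^2 on held-out data:", model.score(X_te, y_te))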
Estimation and Selection via Absolute Penalized Convex Minimization And Its Multistage Adaptive Applications
The ℓ1-penalized method, or the Lasso, has emerged as an important tool for the analysis of large data sets. Many important results have been obtained for the Lasso in linear regression which have led to a deeper understanding of high-dimensional statistical problems. In this article, we consider a class of weighted ℓ1-penalized estimators for convex loss functions of a general form, including ...
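A minimal two-stage Lasso sketch is given below; it mimics the multistage adaptive idea by reweighting the l1 penalty per coordinate via column rescaling. The weights, penalty level, and synthetic data are assumptions, not the estimator defined in that article.

    # Two-stage (adaptive-style) Lasso via column rescaling; purely illustrative.
    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(3)
    n, p = 200, 1000
    beta = np.zeros(p)
    beta[:10] = 2.0
    X = rng.normal(size=(n, p))
    y = X @ beta + 0.5 * rng.normal(size=n)

    # Stage 1: plain l1 penalty.
    stage1 = Lasso(alpha=0.1).fit(X, y)

    # Stage 2: lighter penalty on coordinates that stage 1 found active.
    w = 1.0 / (np.abs(stage1.coef_) + 1e-3)        # per-coordinate weights
    stage2 = Lasso(alpha=0.1).fit(X / w, y)        # rescaling columns = weighted l1
    beta_hat = stage2.coef_ / w
    print("stage-2 nonzeros:", np.count_nonzero(np.abs(beta_hat) > 1e-8))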
Sparse estimation of high-dimensional correlation matrices
Estimating the covariation of variables in high-dimensional data is important for understanding their relations. Recent years have seen several attempts to estimate covariance matrices with sparsity constraints. A new convex optimization formulation is proposed for estimating correlation matrices, which are scale invariant, as opposed to covariance matrices. The constrained optimization problem i...
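As a much simpler stand-in for the convex program described above, the sketch below soft-thresholds the off-diagonal entries of a sample correlation matrix; the threshold value and data are illustrative assumptions.

    # Soft-thresholding of a sample correlation matrix (simplified illustration).
    import numpy as np

    rng = np.random.default_rng(4)
    X = rng.normal(size=(200, 50))
    R = np.corrcoef(X, rowvar=False)           # 50 x 50 sample correlation matrix

    lam = 0.15                                 # threshold (illustrative)
    R_sparse = np.sign(R) * np.maximum(np.abs(R) - lam, 0.0)
    np.fill_diagonal(R_sparse, 1.0)            # keep the unit diagonal
    print("off-diagonal nonzeros:", np.count_nonzero(R_sparse) - 50)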
Picasso: A Sparse Learning Library for High Dimensional Data Analysis in R and Python
We describe a new library named picasso, which implements a unified framework of pathwise coordinate optimization for a variety of sparse learning problems (e.g., sparse linear regression, sparse logistic regression, sparse Poisson regression, and sparse square root loss linear regression), combined with efficient active set selection strategies. In addition, the library allows users to choose diffe...
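The snippet below uses scikit-learn's lasso_path merely to illustrate the pathwise idea (coordinate descent over a grid of penalty values); it is not the picasso API, whose interface is not shown here.

    # Pathwise l1 estimation with scikit-learn's lasso_path (not picasso).
    import numpy as np
    from sklearn.linear_model import lasso_path

    rng = np.random.default_rng(5)
    X = rng.normal(size=(150, 400))
    beta = np.zeros(400)
    beta[:8] = 1.5
    y = X @ beta + 0.1 * rng.normal(size=150)

    alphas, coefs, _ = lasso_path(X, y, n_alphas=50)
    for a, c in zip(alphas[::10], coefs[:, ::10].T):
        print(f"alpha={a:.4f}  nonzeros={np.count_nonzero(c)}")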